Domain adaptation aims to transfer the knowledge acquired by models trained on (data-rich) source domains to (low-resource) target domains; a popular approach is invariant representation learning. While such methods have been studied extensively for classification and regression problems, how they apply to ranking problems, where both the data and the metrics have a list structure, is not well understood. Theoretically, we establish a domain adaptation generalization bound for ranking under listwise metrics such as MRR and NDCG. The bound suggests an adaptation method via learning list-level domain-invariant feature representations, whose benefits are demonstrated empirically by unsupervised domain adaptation experiments on real-world ranking tasks, including passage reranking. A key message is that for domain adaptation, the representations should be analyzed at the same level at which the metric is computed: we show that learning invariant representations at the list level is most effective for adaptation on ranking problems.
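As a quick reference for the listwise metrics named above, here is a minimal Python sketch of MRR and NDCG computed at the list level (the function names and label conventions are illustrative, not the paper's code):

```python
import math

def mrr(relevance):
    """Reciprocal-rank contribution of one ranked list.
    `relevance` is a list of 0/1 labels in ranked order."""
    for i, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / i
    return 0.0

def ndcg(relevance, k=None):
    """Normalized discounted cumulative gain of one ranked list.
    `relevance` holds graded relevance labels in ranked order."""
    k = k or len(relevance)
    dcg = sum(r / math.log2(i + 1) for i, r in enumerate(relevance[:k], start=1))
    ideal = sorted(relevance, reverse=True)
    idcg = sum(r / math.log2(i + 1) for i, r in enumerate(ideal[:k], start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

Both metrics score an entire ranked list at once, which is why the bound above is stated at the list level rather than per item.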
Large language models (LLMs) have shown impressive results across a variety of tasks while requiring little or no direct supervision. Further, there is mounting evidence that LLMs may have potential in information-seeking scenarios. We believe the ability of an LLM to attribute the text that it generates is likely to be crucial for both system developers and users in this setting. We propose and study Attributed QA as a key first step in the development of attributed LLMs. We develop a reproducible evaluation framework for the task, using human annotations as a gold standard and a correlated automatic metric that we show is suitable for development settings. We describe and benchmark a broad set of architectures for the task. Our contributions give some concrete answers to two key questions (How to measure attribution?, and How well do current state-of-the-art methods perform on attribution?), and give some hints as to how to address a third key question (How to build LLMs with attribution?).
Multi-purpose Messaging Mobile Apps (MMMAs) have become increasingly prevalent. MMMAs attract fraudsters, and some cybercriminals provide support for frauds via black market accounts (BMAs). Compared to fraudsters, BMAs are not directly involved in frauds and are more difficult to detect. This paper describes our BMA detection system SGRL (Self-supervised Graph Representation Learning), used in WeChat, a representative MMMA with over a billion users. We tailor graph neural networks and graph self-supervised learning in SGRL for BMA detection. The workflow of SGRL consists of a pretraining phase that utilizes structural information, node attribute information, and available human knowledge, and a lightweight detection phase. In offline experiments, SGRL outperforms state-of-the-art methods by 16.06%-58.17% on offline evaluation measures. We deploy SGRL in the online environment to detect BMAs on the billion-scale WeChat graph, where it exceeds the alternative by 7.27% on the online evaluation measure. In conclusion, SGRL can alleviate label reliance, generalize well to unseen data, and effectively detect BMAs in WeChat.
Health literacy is a primary focus of Healthy People 2030, the fifth iteration of the United States' national health goals and objectives. People with low health literacy often have difficulty following post-visit instructions and using prescriptions correctly, which leads to worse health outcomes and serious health disparities. In this study, we propose to leverage natural language processing techniques to improve the health literacy of patient education materials by automatically translating health-illiterate language in a given sentence. We scraped patient education materials from four online health information websites: medlineplus.gov, drugs.com, mayoclinic.org, and reddit.com. We trained and tested state-of-the-art neural machine translation (NMT) models on a silver-standard training dataset and a gold-standard test dataset, respectively. Experimental results show that the bidirectional long short-term memory (BiLSTM) NMT model outperforms the NMT model based on Bidirectional Encoder Representations from Transformers (BERT). We also verified the effectiveness of the NMT models in translating health-illiterate language by comparing the ratio of health-illiterate language in the sentences. The proposed NMT model is able to identify the correct complex words and simplify them into layman's language, although the model suffers in sentence completeness, fluency, and readability, and has difficulty translating certain medical terms.
Handwritten mathematical expression recognition (HMER) is a challenging task with many potential applications. Recent methods for HMER have achieved outstanding performance with encoder-decoder architectures. However, these methods adhere to the paradigm of predicting "from one character to another," which inevitably produces prediction errors due to the complex structure of mathematical expressions or crabbed handwriting. In this paper, we propose a simple and effective method for HMER, which is the first to incorporate syntax information into an encoder-decoder network. Specifically, we present a set of grammar rules for converting the LaTeX markup sequence of each expression into a parse tree. We then model the markup sequence prediction as a tree traversal process with a deep neural network. In this way, the proposed method can effectively describe the syntax context of expressions, alleviating the structure prediction errors of HMER. Experiments on three benchmark datasets demonstrate that our method achieves better recognition performance than prior arts. To further validate the effectiveness of our method, we create a large-scale dataset consisting of 100k handwritten mathematical expression images acquired from ten thousand writers. The source code, new dataset, and pre-trained models of this work will be publicly available.
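To illustrate the kind of markup-to-tree conversion described above, here is a toy Python sketch that treats each `{ ... }` group in a LaTeX token sequence as the children of the preceding token, and recovers the original sequence by tree traversal (this grammar is a simplification for illustration, not the paper's rule set):

```python
def parse(tokens):
    """Parse a LaTeX token list into a tree: each `{ ... }` group
    becomes a child group of the preceding token. Nodes are
    [label, list_of_child_groups] pairs."""
    def helper(i):
        children = []
        while i < len(tokens):
            tok = tokens[i]
            if tok == '}':
                return children, i + 1
            if tok == '{':
                sub, i = helper(i + 1)
                children[-1][1].append(sub)  # attach group to previous node
            else:
                children.append([tok, []])
                i += 1
        return children, i
    tree, _ = helper(0)
    return tree

def traverse(tree):
    """Depth-first traversal that reproduces the token sequence,
    mirroring how markup prediction can be cast as tree traversal."""
    out = []
    for label, groups in tree:
        out.append(label)
        for g in groups:
            out += ['{'] + traverse(g) + ['}']
    return out
```

For example, `\frac{a}{b}` parses into a `\frac` node with two child groups, and traversal reproduces the original token sequence exactly.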
Defocus blur is a blur effect frequently seen in images, which is challenging to remove due to its spatially variant amount. This paper presents an end-to-end deep learning approach for removing defocus blur from a single image, so as to produce an all-in-focus image for subsequent visual tasks. First, a pixel-wise Gaussian kernel mixture (GKM) model is proposed for representing spatially variant defocus blur kernels in an efficient linear parametric form, with higher accuracy than existing models. Then, a deep neural network called GKMNet is developed by unrolling a fixed-point iteration of GKM-based deblurring. GKMNet is built on a lightweight scale-recurrent architecture, with a scale-recurrent attention module for estimating the mixture coefficients in GKM for defocus deblurring. Extensive experiments show that GKMNet not only noticeably outperforms existing defocus deblurring methods, but also has advantages in terms of model complexity and computational efficiency.
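The linear parametric form of a Gaussian kernel mixture can be illustrated in one dimension: the blurred signal is a per-pixel mixture of the signal convolved with a few fixed Gaussian kernels. The NumPy sketch below assumes this reading; names, shapes, and the choice of basis sigmas are illustrative, not the paper's model:

```python
import numpy as np

def gaussian_kernel(sigma, radius=4):
    """Normalized 1D Gaussian kernel of the given width."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gkm_blur_1d(signal, mixtures, sigmas):
    """Spatially variant blur as a per-pixel mixture of fixed
    Gaussian blurs: y[i] = sum_k mixtures[k, i] * (g_k * x)[i].
    `mixtures` has shape (K, N) with columns summing to 1."""
    basis = np.stack([
        np.convolve(signal, gaussian_kernel(s), mode='same')
        for s in sigmas
    ])  # (K, N): the signal blurred by each basis kernel
    return (mixtures * basis).sum(axis=0)
```

The appeal of this form is that the blur at every pixel is controlled by a small vector of mixture coefficients, which is what a network like GKMNet can estimate.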
Deep neural networks (DNNs) are under threat from adversarial example attacks. An adversary can easily change the outputs of a DNN by adding small, well-designed perturbations to its inputs. Adversarial example detection is fundamental for building robust DNN-based services. Adversarial examples reveal the difference between humans and DNNs in image recognition. From a human-centric perspective, image features can be divided into dominant features, which are comprehensible to humans, and recessive features, which are incomprehensible to humans yet are exploited by DNNs. In this paper, we reveal that imperceptible adversarial examples are the product of recessive features misleading neural networks, and that an adversarial attack is essentially a way of enriching these recessive features in an image. The imperceptibility of an adversarial example indicates that the perturbation enriches recessive features while hardly affecting dominant features. Therefore, adversarial examples are sensitive to filtering out recessive features, while benign examples are immune to such an operation. Inspired by this idea, we propose a label-only adversarial detection approach called Feature Filter. Feature Filter utilizes the discrete cosine transform to approximately separate recessive features from dominant features, and obtains a mutant image whose recessive features are filtered out. Using only the DNN's prediction labels on an input and its mutant, Feature Filter can detect imperceptible adversarial examples in real time with high accuracy and few false positives.
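A minimal sketch of the detection idea, assuming a low-pass DCT filter as the separator between dominant (low-frequency) and recessive (high-frequency) features; the 8x8 coefficient cutoff and function names are illustrative, not the paper's implementation:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II transform matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0] *= np.sqrt(1 / n)
    m[1:] *= np.sqrt(2 / n)
    return m

def low_pass_mutant(image, keep=8):
    """Keep only the top-left `keep` x `keep` low-frequency DCT
    coefficients (dominant features) and zero out the rest
    (recessive features), then invert the transform."""
    n = image.shape[0]
    d = dct_matrix(n)
    coeffs = d @ image @ d.T
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1.0
    return d.T @ (coeffs * mask) @ d

def feature_filter(predict, image, keep=8):
    """Label-only detection: flag the input as adversarial when
    the model's label changes on the low-pass mutant."""
    return predict(image) != predict(low_pass_mutant(image, keep))
```

The detector needs only two label queries per input, which is what makes a real-time, label-only scheme plausible.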
The goal of text-to-image synthesis is to generate a visually realistic image that matches a given text description. In practice, the captions annotated by humans for the same image have large variance in content and word choice. The linguistic discrepancy between captions of the same image leads to synthetic images that deviate from the ground truth. To address this issue, we propose a contrastive learning approach to improve the quality and enhance the semantic consistency of synthetic images. In the pre-training stage, we utilize contrastive learning to learn consistent textual representations for captions corresponding to the same image. Furthermore, in the following stage of GAN training, we employ contrastive learning to enhance the consistency between images generated from captions related to the same image. We evaluate our approach on two popular text-to-image synthesis models, AttnGAN and DM-GAN, on the CUB and COCO datasets, respectively. Experimental results show that our approach can effectively improve the quality of synthetic images in terms of three metrics: IS, FID, and R-precision. In particular, on the challenging COCO dataset, our approach boosts FID significantly, by 29.60% over AttnGAN and by 21.96% over DM-GAN.
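The caption-consistency objective can be sketched with a standard InfoNCE-style contrastive loss, where a caption of the same image acts as the positive and all other captions as negatives; this is a generic formulation, not necessarily the exact loss used in the paper:

```python
import math

def info_nce(sim, pos_index, temperature=0.1):
    """InfoNCE-style contrastive loss for one anchor caption.
    `sim` holds similarities between the anchor and all candidate
    captions; `pos_index` marks the caption of the same image
    (the positive). Lower loss = positive ranked higher."""
    logits = [s / temperature for s in sim]
    m = max(logits)  # subtract max for numerical stability
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[pos_index] - m - math.log(denom))
```

Minimizing this loss pulls representations of same-image captions together and pushes other captions apart, which is the consistency effect described above.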
Adversarial examples are well-designed input samples in which perturbations are imperceptible to the human eye but easily mislead the output of deep neural networks (DNNs). Existing works craft adversarial examples by leveraging simple metrics to penalize perturbations, which lack sufficient consideration of the human visual system (HVS) and thus produce noticeable artifacts. To explore why perturbations are visible, this paper summarizes four primary factors affecting the perceptual sensitivity of the human eye. Based on this investigation, we design a multi-factor metric, MulFactorLoss, for measuring the perceptual loss between benign examples and adversarial ones. In order to test the imperceptibility of the multi-factor metric, we propose a novel black-box adversarial attack termed GreedyFool. GreedyFool applies differential evolution to evaluate the effects of perturbed pixels on the confidence of the target DNN, and introduces a greedy approximation to automatically generate adversarial perturbations. We conduct extensive experiments on the ImageNet and CIFAR-10 datasets, together with a comprehensive user study involving 60 participants. The experimental results demonstrate that MulFactorLoss is a more imperceptible metric than existing pixelwise metrics, and that GreedyFool achieves a 100% success rate in a black-box manner.
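The greedy approximation step can be sketched as follows: repeatedly apply whichever candidate single-pixel perturbation lowers the target model's confidence the most. This sketch omits the differential-evolution scoring and uses illustrative names throughout, so it is a simplification of the attack, not its implementation:

```python
def greedy_perturb(confidence, image, candidates, budget=10):
    """Greedy black-box perturbation sketch. `confidence` maps an
    image to the score of the currently predicted class;
    `candidates` is a list of functions, each returning a copy of
    the image with one pixel perturbed. Stops when no candidate
    lowers confidence or the query budget is exhausted."""
    current = image
    chosen = []
    for _ in range(budget):
        scored = [(confidence(c(current)), i) for i, c in enumerate(candidates)]
        best_score, best_i = min(scored)
        if best_score >= confidence(current):
            break  # no candidate lowers confidence further
        current = candidates[best_i](current)
        chosen.append(best_i)
    return current, chosen
```

The attack is black-box in the sense that only the model's confidence scores are queried, never its gradients.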
In this paper we explore the task of modeling (semi) structured object sequences; in particular we focus our attention on the problem of developing a structure-aware input representation for such sequences. In such sequences, we assume that each structured object is represented by a set of key-value pairs which encode the attributes of the structured object. Given a universe of keys, a sequence of structured objects can then be viewed as an evolution of the values for each key, over time. We encode and construct a sequential representation using the values for a particular key (Temporal Value Modeling - TVM) and then self-attend over the set of key-conditioned value sequences to create a representation of the structured object sequence (Key Aggregation - KA). We pre-train and fine-tune the two components independently and present an innovative training schedule that interleaves the training of both modules with shared attention heads. We find that this iterative two-part training results in better performance than a unified network with hierarchical encoding, as well as over other methods that use a {\em record-view} representation of the sequence \cite{de2021transformers4rec} or a simple {\em flattened} representation of the sequence. We conduct experiments using real-world data to demonstrate the advantage of interleaving TVM-KA on multiple tasks, along with detailed ablation studies motivating our modeling choices. We find that our approach performs better than flattening sequence objects and also allows us to operate on significantly larger sequences than existing methods.
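The key-conditioned value sequences that TVM operates on can be sketched as a simple re-view of the object sequence: instead of one record per time step, build one value sequence per key. This helper is illustrative, not the paper's code:

```python
def key_value_sequences(objects):
    """Re-view a sequence of structured objects (dicts of
    key-value pairs) as one value sequence per key, over time.
    Keys absent from an object yield None at that time step."""
    keys = sorted({k for obj in objects for k in obj})
    return {k: [obj.get(k) for obj in objects] for k in keys}
```

Each per-key sequence can then be encoded independently (TVM), with self-attention across keys aggregating them into a representation of the whole object sequence (KA).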